Speaker normalization for automatic speech recognition - An on-line approach

Authors

  • Ioannis Dologlou
  • Tom Claes
  • Louis ten Bosch
  • Dirk Van Compernolle
  • Hugo Van hamme
Abstract

We propose a method to transform the on-line speech signal so that it complies with the specifications of an HMM-based automatic speech recognizer. The spectrum of the input signal undergoes vocal tract length (VTL) normalization based on differences in the average third formant, F3. The high-frequency gap that is generated after scaling is estimated by means of an extrapolation scheme. Mel-scale cepstral coefficients (MFCC) are used along with delta and delta-delta cepstra as well as delta and delta-delta energy. The method has been tested on the TI digits database, which contains adult and children's speech, and provides substantial gains with respect to non-normalized speech.
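The scaling step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function name `vtl_normalize`, the linear warping factor taken as the ratio of a reference F3 to the speaker's average F3, and the choice to fill the high-frequency gap by holding the last spectral value (the abstract does not specify its extrapolation scheme).

```python
def vtl_normalize(spectrum, f3_speaker, f3_reference):
    """Linearly rescale the frequency axis of one magnitude spectrum.

    Hypothetical helper: the warping factor and the gap filling below
    are assumptions, not the scheme used in the paper.
    """
    # Assumed warping factor: ratio of reference F3 to speaker F3.
    alpha = f3_reference / f3_speaker
    last = len(spectrum) - 1
    warped = []
    for k in range(len(spectrum)):
        # Output bin k reads the input spectrum at position k / alpha.
        pos = k / alpha
        if pos >= last:
            # High-frequency gap: no input data exists here, so hold
            # the last value (a stand-in for real extrapolation).
            warped.append(spectrum[last])
            continue
        lo = int(pos)
        frac = pos - lo
        # Linear interpolation between the two neighbouring input bins.
        warped.append((1.0 - frac) * spectrum[lo] + frac * spectrum[lo + 1])
    return warped


# Usage: a toy 64-bin spectrum with a single peak at bin 30, for a
# speaker whose average F3 (3000 Hz) is above the reference (2500 Hz).
toy = [0.0] * 64
toy[30] = 1.0
normalized = vtl_normalize(toy, f3_speaker=3000.0, f3_reference=2500.0)
```

With a speaker F3 above the reference, the spectrum is compressed toward lower frequencies, which is why the upper bins have no source data and must be filled in.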

Similar resources

Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Impact of Vocal Tract Length Normalization on the Speech Recognition Performance of an English Vowel Phoneme Recognizer for the Recognition of Children Voices

Differences in human vocal tract lengths can cause inter-speaker acoustic variability in speech signals spoken by different speakers for the same text, and these variations affect the robustness of a speaker-independent (SI) speech recognition system. Speaker normalization using vocal tract length normalization (VTLN) is an effective approach to reduce the effect of these...

Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization

A proven method for achieving effective automatic speech recognition (ASR) due to speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed which require limited computing resources for real-time performance. The most popular speaker normalization technique is vocal-tract length normalization (VTLN), despite the fact that i...

Speaker Normalization for Improved Automatic Speech Recognition for Digital Libraries

Wei Wang, Old Dominion University, 2004. Director: Dr. Stephen A. Zahorian. The context of the thesis work is the improvement of automatic speech recognition (ASR) for use with digital libraries. First, commonly used multimedia file formats and codecs are surveyed with the objective of identifying those formats t...

Speaker-independent silent speech recognition with across-speaker articulatory normalization and speaker adaptive training

Silent speech recognition (SSR) converts non-audio information (e.g., articulatory information) to speech. SSR has potential to enable laryngectomees to produce synthesized speech with a natural sounding voice. Despite its recent advances, current SSR research has largely relied on speaker-dependent recognition. High degree of variation in articulatory patterns across different talkers has been...

Publication date: 1998